Evaluating Evaluation Measures
Abstract
This paper presents a thorough examination of the validity of three evaluation measures for parser output. We assess the performance of an unlexicalised probabilistic parser trained on two German treebanks with different annotation schemes, and evaluate the parsing results using the PARSEVAL metric, the Leaf-Ancestor metric and a dependency-based evaluation. We reject the claim that the TüBa-D/Z annotation scheme is more adequate than the TIGER scheme for PCFG parsing, and show that PARSEVAL should not be used to compare parsers trained on treebanks with different annotation schemes. An analysis of specific error types indicates that the dependency-based evaluation is the most appropriate of the three for reflecting parse quality.
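For readers unfamiliar with PARSEVAL, the following is a minimal sketch of how labelled bracketing precision, recall and F1 are commonly computed. It is not the evaluation setup used in the paper. The nested-tuple tree encoding, the function names `brackets` and `parseval`, and the toy English trees are assumptions made for illustration; the sketch also omits a part-of-speech layer and scores every phrase node, including the root.

```python
from collections import Counter


def brackets(tree, start=0):
    """Collect labelled spans from a nested-tuple tree.

    Assumed encoding: a tree is (label, child, child, ...) and a leaf is a
    plain string token. Returns (Counter of (label, start, end), end position).
    """
    label, children = tree[0], tree[1:]
    spans = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a token advances the position by one
        else:
            child_spans, pos = brackets(child, pos)
            spans.update(child_spans)
    spans[(label, start, pos)] += 1  # the span covered by this node
    return spans, pos


def parseval(gold, test):
    """Return labelled bracketing (precision, recall, F1) for one sentence."""
    gold_spans, _ = brackets(gold)
    test_spans, _ = brackets(test)
    # Multiset intersection: a bracket matches if label and span both agree.
    matched = sum((gold_spans & test_spans).values())
    precision = matched / sum(test_spans.values())
    recall = matched / sum(gold_spans.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Toy example: the parser output attaches the object flatly under VP.
gold = ("S", ("NP", "the", "dog"), ("VP", "chased", ("NP", "the", "cat")))
test = ("S", ("NP", "the", "dog"), ("VP", "chased", "the", "cat"))
precision, recall, f1 = parseval(gold, test)
print(precision, recall, f1)
```

In this toy case the flatter tree contains three brackets, all of which match the gold tree, while the gold tree contains four, so precision is 1.0 and recall is 0.75. Because the score counts brackets, the number of nodes a tree contains directly affects it.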
Similar resources
Presenting a New Model for Bank’s Supply Chain Performance Evaluating with DEA Solution Approach
Data Envelopment Analysis (DEA) is a method for measuring the efficiency of peer decision making units (DMUs) with multiple inputs and outputs. The traditional DEA treats decision making units under evaluation as black boxes and calculates their efficiencies with first inputs and last outputs. This carries the notion of missing some intermediate measures in the process of changing the inputs to...
Contemporary methods for evaluating complex project proposals
The ability to evaluate project proposals, assessing future success and organizational value, is critical to overall business performance for most enterprises. Yet, predicting project success is difficult and often unreliable. A four-year field study shows that the effectiveness of available methods for evaluating and selecting large, complex projects depends on the specific project type, org...
Using DEMATEL Method to Develop Conceptual Model for Evaluating Green Suppliers
Nowadays stakeholders and public awareness have increased the pressure on companies for environmental issues. Thus, green supply chain management (GSCM) seems vital for companies' environmental compliance and business growth. Companies continuously seek novel ideas and methods, that can enable them to obtain and/or maintain environmental sustainability. Greening the supply chain is one such...
Evaluating machine translation output with automatic sentence segmentation
This paper presents a novel automatic sentence segmentation method for evaluating machine translation output with possibly erroneous sentence boundaries. The algorithm can process translation hypotheses with segment boundaries which do not correspond to the reference segment boundaries, or a completely unsegmented text stream. Thus, the method is especially useful for evaluating translations of...
Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines
Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...